chore: add schema for UC Metric Views on Analytics plugin #429

Merged: atilafassina merged 7 commits into main from mv-schema on Jun 10, 2026

@atilafassina atilafassina commented Jun 9, 2026

What

PR1 of the metric-views delivery stack: the schema contract for metric.json — the opt-in config that activates the analytics metric-view path. Ships in the shared package.

This is the typed contract only — no runtime, CLI, hook, UI, or demo. It lands inert: nothing executes until an app actually ships a config/queries/metric.json.

Why now / why it's safe

This re-ships part of #341 (a ~15k-line mega-PR) as a small, reviewable increment. The feature is opt-in and dormant (every metric-view code path gates on metric.json existing), and these are all new files — so it lands on main without touching any existing behavior.

What's in it — 4 files, +267

| File | Purpose |
| --- | --- |
| packages/shared/src/schemas/metric-source.ts | Zod source of truth: metricSourceSchema + inferred MetricSource/MetricExecutor types |
| tools/generate-json-schema.ts | extended (+13) to emit the JSON Schema artifact |
| docs/static/schemas/metric-source.schema.json | generated, published JSON Schema (powers editor $schema autocomplete + validation) |
| packages/shared/src/schemas/metric-source.test.ts | 13 safeParse validation cases |

The metric.json contract

{
  "$schema": "https://databricks.github.io/appkit/schemas/metric-source.schema.json",
  "metricViews": {
    "revenue":   { "source": "main.analytics.revenue" },
    "my_orders": { "source": "main.sales.orders_by_user", "executor": "user" }
  }
}
  • Top-level { $schema?, metricViews? }, closed (rejects unknown keys).
  • metricViews is a single map of metric key → entry. One map (rather than per-executor sections) makes metric keys unique by construction — the route key space can't collide.
  • metric key: identifier pattern ^[a-zA-Z_][a-zA-Z0-9_]*$ — becomes the route key (POST /api/analytics/metric/:key), the useMetricView('<key>', …) argument, and the MetricRegistry augmentation key.
  • source: three-part Unity Catalog FQN <catalog>.<schema>.<metric_view>.
  • executor: "app_service_principal" (default) runs as the app service principal with a shared cache; "user" runs on-behalf-of the requesting user with a per-user cache. Defaulting to app_service_principal matches plain <key>.sql queries executing as SP.
  • Entries are objects (not bare strings) so future per-entry options (cacheTtl, defaultFilter, allowlists) are additive — executor is the first such option.
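
The rules in the list above can be sketched without any dependencies. This is an illustrative stand-in, not the PR's Zod schema: `parseMetricSource`, `ParseResult`, and the issue messages are hypothetical names, the result shape only mimics the `safeParse` success/failure convention, and the FQN regex assumes each segment is a non-empty run without dots or whitespace (the description does not spell out the segment grammar).

```typescript
type Executor = "app_service_principal" | "user";
interface MetricEntry { source: string; executor: Executor }
interface MetricSource { $schema?: string; metricViews?: Record<string, MetricEntry> }

// Hypothetical result type that mirrors the success/failure split of safeParse.
type ParseResult =
  | { success: true; data: MetricSource }
  | { success: false; issues: string[] };

// Identifier pattern quoted in the PR description.
const KEY_PATTERN = /^[a-zA-Z_][a-zA-Z0-9_]*$/;
// Assumption: three non-empty segments separated by dots.
const FQN_PATTERN = /^[^.\s]+\.[^.\s]+\.[^.\s]+$/;

const isPlainObject = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

const isExecutor = (v: unknown): v is Executor =>
  v === "app_service_principal" || v === "user";

function parseMetricSource(input: unknown): ParseResult {
  if (!isPlainObject(input)) {
    return { success: false, issues: ["config must be an object"] };
  }
  const issues: string[] = [];
  const data: MetricSource = {};

  // Closed top level: only $schema and metricViews are accepted.
  for (const key of Object.keys(input)) {
    if (key !== "$schema" && key !== "metricViews") {
      issues.push(`unknown top-level key "${key}"`);
    }
  }

  const schemaUrl = input["$schema"];
  if (typeof schemaUrl === "string") data.$schema = schemaUrl;
  else if (schemaUrl !== undefined) issues.push("$schema must be a string");

  const rawViews = input["metricViews"];
  if (isPlainObject(rawViews)) {
    const views: Record<string, MetricEntry> = {};
    for (const [key, entry] of Object.entries(rawViews)) {
      if (!KEY_PATTERN.test(key)) issues.push(`invalid metric key "${key}"`);
      if (!isPlainObject(entry)) {
        issues.push(`entry "${key}" must be an object, not a bare value`);
        continue;
      }
      // Entries are closed too: only source and executor are accepted.
      for (const field of Object.keys(entry)) {
        if (field !== "source" && field !== "executor") {
          issues.push(`unknown field "${field}" in entry "${key}"`);
        }
      }
      const source = entry["source"];
      // The default materializes only when executor is absent.
      const executor =
        entry["executor"] === undefined ? "app_service_principal" : entry["executor"];
      if (typeof source !== "string" || !FQN_PATTERN.test(source)) {
        issues.push(`entry "${key}" needs a three-part source FQN`);
      } else if (!isExecutor(executor)) {
        issues.push(`entry "${key}" has an unsupported executor`);
      } else {
        views[key] = { source, executor };
      }
    }
    data.metricViews = views;
  } else if (rawViews !== undefined) {
    issues.push("metricViews must be an object");
  }

  return issues.length === 0
    ? { success: true, data }
    : { success: false, issues };
}
```

Parsing the example config with this sketch fills in `executor: "app_service_principal"` on `revenue`, while a config using the legacy `sp`/`obo` keys fails on the closed top level.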

Shape revision (review feedback): the original proposal used top-level sp/obo lane sections, mirroring the <key>.sql / <key>.obo.sql query-file convention. Per review discussion this was reshaped to the entity-first metricViews map with a per-entry executor — clearer naming, execution mode as an attribute rather than a taxonomy, and the cross-lane duplicate-key rule (previously unexpressible in the schema, enforced post-parse) becomes unrepresentable.
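
To make the reshaping concrete, here is a minimal sketch of converting the legacy two-lane shape into the entity-first map. `migrateLegacyConfig` and the type names are hypothetical and not part of this PR; the point is only that the duplicate-key check, which the legacy shape needed after parsing, has nowhere to live once there is a single map.

```typescript
type Executor = "app_service_principal" | "user";

interface LegacyEntry { source: string }
// Legacy shape from the original proposal: one section per execution lane.
interface LegacyConfig {
  sp?: Record<string, LegacyEntry>;
  obo?: Record<string, LegacyEntry>;
}

interface MetricViewEntry { source: string; executor?: Executor }
interface MetricViewsConfig { metricViews: Record<string, MetricViewEntry> }

// Hypothetical one-off converter, for illustration only.
function migrateLegacyConfig(legacy: LegacyConfig): MetricViewsConfig {
  const merged = new Map<string, MetricViewEntry>();

  for (const [key, entry] of Object.entries(legacy.sp ?? {})) {
    // app_service_principal is the default, so executor can be omitted.
    merged.set(key, { source: entry.source });
  }

  for (const [key, entry] of Object.entries(legacy.obo ?? {})) {
    // The legacy shape could declare one key in both lanes; the schema could
    // not forbid that, so the check had to happen here, after parsing.
    if (merged.has(key)) {
      throw new Error(`metric key "${key}" is declared in both sp and obo`);
    }
    merged.set(key, { source: entry.source, executor: "user" });
  }

  // A single object map cannot hold the same key twice, so the converted
  // config needs no equivalent check.
  return { metricViews: Object.fromEntries(merged) };
}
```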

Reconciliation note (for reviewers comparing against #341)

#341 authored this JSON-Schema-first (hand-written .schema.json → generated .ts via a generate-schema-types.ts tool). Since #341's base, main adopted a Zod-first convention in the manifest refactor (#261): Zod is the single source of truth and the JSON Schema is generated from it via tools/generate-json-schema.ts. This PR re-expresses the contract in main's current idiom. Consequences vs. the #341 reference:

  • No *.generated.ts — the inferred MetricSource type replaces it.
  • No ajv / ajv-formats — validation rides the Zod schema directly via safeParse, matching validate-manifest.ts.
  • No package-internal .schema.json — the generated JSON lives only in docs/static/schemas/, exactly like the manifest schemas.

Note the shape also differs from #341 (see revision note above): downstream slices (runtime, typegen, CLI) port their config-reading layer against this revised contract.

Testing

  • pnpm build && pnpm docs:build
  • pnpm check:fix && pnpm -r typecheck
  • pnpm test ✅ — metric-source.ts at 100% coverage. The generator is deterministic (re-running produces no diff) and does not drift the existing manifest/template schemas. Tests include rejection of the legacy sp/obo shape and verification that the executor default materializes on parse.

Stack context

Part of re-shipping #341 as a stacked chain (merge order PR0 → PR1 → PR3 → PR4 → PR2 → PR5 → PR6):

Only hard dependency on this PR: PR4 (appkit metric sync CLI) imports this schema and will validate via the Standard Schema interface (no Ajv).
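
For readers unfamiliar with the Standard Schema interface, the sketch below shows how a consumer can validate through it without knowing which library produced the schema. The interface is a trimmed local copy of the Standard Schema v1 shape so the block stays self-contained; `validateSync` and `sourceFqnSchema` are hypothetical, and the toy schema stands in for `metricSourceSchema`, which would only work here if the Zod version in use exposes the `~standard` property. How PR4 actually wires this up is not shown in this PR.

```typescript
// Trimmed local copy of the Standard Schema v1 shape (assumption: only the
// fields this sketch needs are declared).
interface StandardIssue { readonly message: string }

type StandardResult<T> =
  | { readonly value: T; readonly issues?: undefined }
  | { readonly issues: ReadonlyArray<StandardIssue> };

interface StandardSchema<T> {
  readonly "~standard": {
    readonly version: 1;
    readonly vendor: string;
    readonly validate: (
      value: unknown,
    ) => StandardResult<T> | Promise<StandardResult<T>>;
  };
}

// Hypothetical helper: validate synchronously or throw with the joined issues.
function validateSync<T>(schema: StandardSchema<T>, input: unknown): T {
  const result = schema["~standard"].validate(input);
  if (result instanceof Promise) {
    throw new TypeError("expected synchronous validation");
  }
  if (result.issues) {
    throw new Error(result.issues.map((issue) => issue.message).join("; "));
  }
  return result.value;
}

// Toy schema standing in for the real one: accepts a three-part dotted name.
const sourceFqnSchema: StandardSchema<string> = {
  "~standard": {
    version: 1,
    vendor: "sketch",
    validate: (value) =>
      typeof value === "string" &&
      value.split(".").length === 3 &&
      value.split(".").every((part) => part.length > 0)
        ? { value }
        : { issues: [{ message: "source must be <catalog>.<schema>.<metric_view>" }] },
  },
};
```

Because the caller only touches `~standard`.validate, swapping the toy schema for a library-provided one would not change `validateSync`.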


This pull request and its description were written by Isaac.

@atilafassina atilafassina changed the title from "feat(shared): metric.json schema (Zod source + generated JSON Schema)" to "chore: add schema for UC Metric Views on Analytics plugin" on Jun 9, 2026
@atilafassina atilafassina marked this pull request as ready for review June 9, 2026 15:36
@atilafassina atilafassina requested a review from a team as a code owner June 9, 2026 15:36
…hema)

Author the metric.json config contract as Zod in
packages/shared/src/schemas/metric-source.ts (single source of truth)
and generate the published JSON Schema via tools/generate-json-schema.ts
into docs/static/schemas/metric-source.schema.json. metric.json declares
the Unity Catalog Metric View sources (sp/obo lanes) that opt an app into
the analytics metric-view path.

Reconciles PR1 of the metric-views stack onto main's Zod-first schema
convention; #341 authored this JSON-first with a separate generated type.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Validate metricSourceSchema directly via safeParse (no Ajv): accepts
sp-only / mixed sp+obo / empty configs; rejects bare-string entries,
missing source, unknown entry and top-level fields, invalid metric keys
(leading digit, hyphen), and malformed source FQNs. Ports the #341 case
set to main's Zod-first validation idiom.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Trim the module header, move the object-entry rationale to a @note on
metricEntrySchema, and drop the section banners. Comment-only — the Zod
schema, describe() strings, and generated JSON are unchanged.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
@atilafassina atilafassina changed the title from "chore: add schema for UC Metric Views on Analytics plugin" to "chore(shared): metric.json schema (Zod source + generated JSON Schema)" on Jun 9, 2026
@atilafassina atilafassina changed the title from "chore(shared): metric.json schema (Zod source + generated JSON Schema)" to "chore: add schema for UC Metric Views on Analytics plugin" on Jun 9, 2026

Copilot AI left a comment

Pull request overview

Adds the initial schema contract for config/queries/metric.json (authored in Zod and emitted as JSON Schema) to support the upcoming analytics metric-view stack.

Changes:

  • Introduces a new Zod schema (metricSourceSchema) that defines the allowed shape for metric.json (closed top-level object with optional $schema, sp, obo).
  • Adds Vitest coverage for accepted/rejected configurations (metric key pattern, strict objects, 3-part UC FQN).
  • Extends the JSON Schema generation script to emit and publish metric-source.schema.json under docs/static/schemas.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tools/generate-json-schema.ts | Emits and writes the generated JSON Schema for the new metric-source Zod schema. |
| packages/shared/src/schemas/metric-source.ts | Defines the Zod source-of-truth schema and inferred TS types for metric.json. |
| packages/shared/src/schemas/metric-source.test.ts | Adds schema validation tests (valid/invalid configs). |
| docs/static/schemas/metric-source.schema.json | Generated draft-07 JSON Schema artifact served by docs for editor $schema validation. |

Comment thread packages/shared/src/schemas/metric-source.ts
Comment thread packages/shared/src/schemas/metric-source.test.ts Outdated
The SP-only case carried an explicit obo: {} (ported verbatim from the
reference); empty-lane coverage already lives in the empty-configuration
case. Review feedback on #429.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

@calvarjorge calvarjorge left a comment

Maybe this was already discussed and I missed it, but I don't fully agree with having the sp/obo as the top key. Basically, currently, it would be:

{
    "sp":  { "revenue":   { "source": "main.analytics.revenue" } },
    "obo": { "my_orders": { "source": "main.sales.orders_by_user" } }
 }

I'd prefer:

{
    "metrics": {
      "revenue":   { "source": "main.analytics.revenue",        "execute_as": "app_service_principal" },
      "my_orders": { "source": "main.sales.orders_by_user",     "execute_as": "user" }
    }
 }

The reasons why I believe this would be more intuitive are:

  • The query execution principal is not an important enough concept to be the first-level key. In fact, many users may not even care (and we can just use a default for them?).
  • Having it as a key makes it hard to even know what it is. IMO keys in an object should represent entities of the parent object.
  • The sp / obo naming is quite hard to understand. I'd prefer full text (or at least accepting it), i.e., service_principal and on_behalf_of, or ideally app_service_principal and user to be more specific.

…xecutor

Replace the sp/obo lane sections with a single 'metrics' map; the
execution principal moves into each entry as 'executor'
("service_principal" | "user"), defaulting to service_principal —
consistent with plain .sql queries executing as SP.

Entity-first also makes metric keys unique by construction: the same key
can no longer be declared in two lanes, so the cross-lane duplicate rule
(previously unexpressible in the schema and enforced post-parse) becomes
unrepresentable.

Review feedback from calvarjorge on #429.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
@atilafassina

atilafassina commented Jun 10, 2026

@calvarjorge agreed and adopted in 917242a (value renamed to app_service_principal in 5fc4a00) — the schema is now an entity-first metrics map with the execution principal as a per-entry attribute. Two naming tweaks from your sketch: the field is executor and the values are "app_service_principal" | "user", defaulting to app_service_principal (consistent with plain <key>.sql queries executing as SP, for the users who don't care).

{
  "metrics": {
    "revenue":   { "source": "main.analytics.revenue" },
    "my_orders": { "source": "main.sales.orders_by_user", "executor": "user" }
  }
}

Bonus your shape bought us: with a single map, metric keys are unique by construction — the old two-lane shape could express the same key in both sp and obo, which the schema couldn't forbid (cross-record key disjointness isn't expressible in JSON Schema), so the reference implementation enforced it post-parse in two separate places. That rule is now unrepresentable and both checks disappear from the downstream slices.

…_principal

More specific about whose service principal executes the query — the
app's. Bare service_principal is now a rejected value.

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>

@calvarjorge calvarjorge left a comment

Overall looks good. Only one last thing: the concept of a metric is quite broad. Maybe we want to be more specific and refer to this as metric_view? Especially if we might have other concepts similar to metrics in the future (telemetry metrics, etc.)

The entries are UC metric views, not generic metrics — the key now says
so. camelCase per the repo's authored-config-key convention (manifest
keys like displayName/dependsOn; snake_case is reserved for values and
fields mirroring Databricks APIs).

Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
@atilafassina

Follow-up on the shape discussion: renamed the root key metrics → metricViews in aa5d7fb. These entries are specifically UC metric views, and the bare metrics name collided with the observability sense of the word. camelCase per the repo's authored-config-key convention (appkit.plugins.json's displayName/dependsOn/…); snake_case stays reserved for values mirroring Databricks APIs (app_service_principal, sql_warehouse).

{
  "metricViews": {
    "revenue":   { "source": "main.analytics.revenue" },
    "my_orders": { "source": "main.sales.orders_by_user", "executor": "user" }
  }
}
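
As a rough illustration of how a consumer might read the final shape, the sketch below derives the route path and a cache scope per entry. The route template comes from the PR description (POST /api/analytics/metric/:key); `metricRoutePath`, `cacheScope`, and the cache-key string format are assumptions made for this example, since the runtime is not part of this PR.

```typescript
type Executor = "app_service_principal" | "user";
interface MetricViewEntry { source: string; executor?: Executor }

// Metric keys match ^[a-zA-Z_][a-zA-Z0-9_]*$, so they are URL-safe as-is.
function metricRoutePath(key: string): string {
  return `/api/analytics/metric/${key}`;
}

// Hypothetical cache-key format: shared for the app service principal,
// partitioned per user when the entry runs on behalf of the requester.
function cacheScope(key: string, entry: MetricViewEntry, userId: string): string {
  const executor = entry.executor ?? "app_service_principal";
  return executor === "user"
    ? `metric:${key}:user:${userId}`
    : `metric:${key}:shared`;
}

const metricViews: Record<string, MetricViewEntry> = {
  revenue: { source: "main.analytics.revenue" },
  my_orders: { source: "main.sales.orders_by_user", executor: "user" },
};

// One line per entry: route path followed by the cache scope for user "u1".
const summary = Object.entries(metricViews).map(
  ([key, entry]) => `${metricRoutePath(key)} ${cacheScope(key, entry, "u1")}`,
);
```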

@atilafassina atilafassina merged commit 79c5055 into main Jun 10, 2026
9 of 11 checks passed
@atilafassina atilafassina deleted the mv-schema branch June 10, 2026 14:35
3 participants